MASTER Privacy-preserving DNA sequence alignment

Author

  • Sakina Asadova
Abstract

The rapid growth of Big Data has created a need for joint computation, in which people or organizations cooperatively contribute private inputs to carry out various computational tasks. These tasks vary widely and may take place between mutually untrusted entities. One example is competing organizations that collaborate on a project and need to share private database information. To protect the organizations' valuable private assets, such joint computations must be carried out securely. Today, this usually requires choosing at least one trusted entity that is told the private inputs of both parties. However, if the entities do not trust each other and no party can be trusted to perform the computation, they need a cryptographic protocol that ensures privacy without a trusted party. In the literature, this problem is solved by secure Multi-Party Computation (MPC), which is of prime importance in cryptography. MPC performs a computation in such a way that the output is guaranteed to be correct and cheating parties cannot learn any information about the inputs of the honest parties. Although the MPC problem was introduced and solved almost 35 years ago, practical real-world applications have emerged in various research fields and computation domains only in recent years. One of the most sensitive application fields for MPC is privacy-preserving database queries in the healthcare sector. The problem is to determine whether a private search query occurs in a database whose private contents (e.g. DNA sequences) must be kept secret, apart from what can be derived from the query result. String matching has been widely researched in the literature, both with and without MPC techniques. However, solving the approximate string matching problem under tight privacy constraints is not a trivial task.
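To make the idea of computing on private inputs concrete, here is a minimal sketch of Shamir secret sharing, the primitive the thesis says its underlying protocols are based on. It is not the thesis code and does not use VIFF: the field modulus `PRIME` and the function names `share`, `reconstruct`, and `add_shares` are assumptions chosen for illustration. Each party holds one share per secret, adds its shares locally, and only the sum is ever reconstructed.

```python
import secrets

# Assumed field modulus (the Mersenne prime 2^61 - 1); not taken from the thesis.
PRIME = 2**61 - 1

def share(secret, n, t):
    """Split `secret` into n shares; any t + 1 of them reconstruct it."""
    # Random polynomial of degree t with constant term equal to the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(t)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at point x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        inv_den = pow(den, PRIME - 2, PRIME)  # modular inverse via Fermat
        secret = (secret + yi * num * inv_den) % PRIME
    return secret

def add_shares(a, b):
    """Each party adds its two shares locally; no communication is needed."""
    return [(xa, (ya + yb) % PRIME) for (xa, ya), (xb, yb) in zip(a, b)]

# Two private inputs, shared among 3 parties with threshold t = 1.
a_shares = share(12, n=3, t=1)
b_shares = share(30, n=3, t=1)
total = reconstruct(add_shares(a_shares, b_shares)[:2])  # any 2 shares suffice
```

With threshold t = 1, a single share is a uniformly random field element and reveals nothing about the input. Multiplication of shared values requires interaction between the parties and is omitted from this sketch.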
We used a particular algorithm, the Burrows-Wheeler transform (BWT), to study the sequence comparison problem, and applied MPC techniques to investigate the applicability of the method and to produce privacy-preserving DNA sequence alignment algorithms. We implemented our protocols in Python using VIFF, a framework supporting MPC whose underlying protocols are based on Shamir secret sharing. Through judicious use of secret indexing and masking techniques, we were able to implement the protocols recursively, as in the original implementation. In particular, we identified and analyzed two models for the inexact string matching problem: one in which a private search query is searched for within a public reference string, and another in which both the search query and the reference string are private. In a real-world use case, for example, the private search query could be DNA mutations characteristic of a particular illness, and the reference string could be a human genome. To highlight the importance of MPC, both models were verified to fully preserve obliviousness. Additionally, one of the major goals of this study is to introduce and analyze a concrete approach to oblivious verification of inexact string matching. This approach applies a specific cryptographic concept, the zk-SNARK (zero-knowledge succinct non-interactive argument of knowledge), which ensures perfect security owing to the zero-knowledge proof. This verifiable computation provides a proof of correctness of the computation and full protection against disclosure of private information to an adversarial verifier, even one with unlimited computational power.
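As a point of reference, the following is a plaintext (non-private) sketch of recursive inexact matching over a BWT index, in the general style of BWT-based read aligners. It is an illustration under stated assumptions, not the thesis's protocol: the names `bwt_index` and `inexact_search` are hypothetical, only substitutions are handled (no insertions or deletions), and nothing here is secret-shared. In an MPC version, the table lookups indexed by query characters would presumably have to be replaced by oblivious operations such as the secret indexing the abstract mentions.

```python
def bwt_index(ref):
    """Build suffix array, C table and Occ table for ref + '$' (naive construction)."""
    text = ref + "$"
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    alphabet = sorted(set(text))
    # C[c]: number of characters in text that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ

def inexact_search(query, ref, max_mismatches):
    """Return sorted start positions where query matches ref with at most
    max_mismatches substitutions, using recursive backward search."""
    sa, C, occ = bwt_index(ref)
    hits = set()

    def recur(i, budget, lo, hi):
        # [lo, hi) is the suffix-array interval matching the processed query suffix.
        if i < 0:
            hits.update(sa[lo:hi])
            return
        for c in C:
            if c == "$":
                continue
            new_lo = C[c] + occ[c][lo]
            new_hi = C[c] + occ[c][hi]
            if new_lo >= new_hi:
                continue  # no suffix of ref continues with this character
            cost = 0 if c == query[i] else 1
            if cost <= budget:
                recur(i - 1, budget - cost, new_lo, new_hi)

    recur(len(query) - 1, max_mismatches, 0, len(ref) + 1)
    return sorted(hits)

exact = inexact_search("ACGT", "ACGTACGA", 0)
one_off = inexact_search("ACGT", "ACGTACGA", 1)
```

Here `exact` contains only position 0, while `one_off` also includes position 4, where `ACGA` differs from the query in one character. The index construction is quadratic and meant only for short strings; a genome-scale reference would need a proper suffix-array builder.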


Similar resources

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...


Eindhoven University of Technology MASTER Privacy-preserving DNA sequence alignment

The rapid growth of Big Data has created a need for joint computation, in which people or organizations cooperatively contribute private inputs to carry out various computational tasks. These tasks vary widely and may take place between mutually untrusted entities. One example is competing organizations that collaborate on a project and need to share private data...


Clustering-based Multidimensional Sequence Data Anonymization

Sequence data mining has many interesting applications in a large number of domains, including finance, medicine, and business. However, sequence data often contains sensitive information about individuals, and improper release and usage of this data may lead to privacy violations. In this paper, we study the privacy issues in publishing multidimensional sequence data. We propose an anonymization ...


Cloud-Assisted Read Alignment and Privacy

Thanks to the rapid advances in sequencing technologies, genomic data is now being produced at an unprecedented rate. To adapt to this growth, several algorithms and paradigm shifts have been proposed to increase the throughput of the classical DNA workflow, e.g. by relying on the cloud to perform CPU intensive operations. However, the scientific community raised an alarm due to the possible pr...


Privacy-preserving Ontology Matching

Increasingly, there is a recognized need for secure information sharing. In order to implement information sharing between diverse organizations, we need privacy-preserving interoperation systems. In this work, we describe two frameworks for privacy-preserving interoperation systems. Ontology matching is an indispensable component of interoperation systems. To implement privacy-preserving intero...




Publication date 2017